(54) Method of encoding digital audio signals

(57) The method of encoding digital data in the present
invention enables change of minimum limit of audibility
characteristics and/or masking characteristics, which are
usually set on the basis of the aural-psychological
characteristics of persons with typical hearing, thus changing
the allocation of quantized bits to each frequency band and
allowing selection of a sound quality which accords with the
listener's hearing. The present invention is suitable for
ATRAC, a method for compressed encoding for mini-discs.



FIG. 1 (vertical axis: POWER)






EP 0 855 805 A2






Description 

FIELD OF THE INVENTION 

The present invention relates to a method of encoding digital
data in which, when recording musical tones, sounds, etc. in
recording media such as mini-discs, bits are allocated to the
spectrum of each frequency band in response to the musical
tones, sounds, etc. so as to compress data volume.

BACKGROUND OF THE INVENTION 

One method of highly efficient compressed encoding of digital
data such as musical tones and sounds is ATRAC (Adaptive
Transform Acoustic Coding), used in mini-discs. In ATRAC, so
that the digital data is compressed with high efficiency, it is
first broken down into a plurality of frequency bands, then
divided into blocks in accordance with time units of variable
length, transformed into spectral signals by MDCT (Modified
Discrete Cosine Transform) processing, and then each spectral
signal is encoded by the number of quantized bits which have
been allocated to it taking into account aural-psychological
characteristics.

Among the aural-psychological characteristics which can be
applied to the compressed encoding are loudness-level
characteristics and masking effect. Loudness-level
characteristics show that even with the same sound pressure
level, the loudness of a sound sensed by a person changes
according to the frequency of the sound. Accordingly, this
shows that the minimum limit of audibility, which shows the
smallest loudness which can be heard by a person, changes
according to the frequency. As for masking effect, there are
two kinds: simultaneous masking effect and elapsed masking
effect. Simultaneous masking effect is a phenomenon in which,
when several sounds of different frequency composition occur
simultaneously, one sound makes another difficult to hear.
Elapsed masking effect is a phenomenon in which the masking
occurs before and after a loud sound along the time axis of the
loud sound.

An example of conventional art which makes use of the elapsed
masking effect is Japanese Unexamined Patent Publication No.
5-91061/1993. In this conventional art, when a transient signal
is included in one of the frequency conversion time units, bits
are allocated in accordance with a word length which varies
depending on the energy of previous time units and on the
amount of masking, thereby preventing a sound quality
deterioration called "pre-echo." Again, Japanese Unexamined
Patent Publication No. 5-248972/1993 proposes a technique for
improving the efficiency of encoding by using elapsed masking
in reference to the spectral distribution of previous time
units.

Another example of bit allocation using the aural-psychological
characteristics is one called the repetition method, in which
actual bit allocation suited to input digital data is performed
as follows. First, the power S of each frequency band, and the
masking threshold M of that power S on the other frequency
bands, are found. Next, from the masking threshold M and the
power of quantized noise N(n) (when each frequency band is
quantized into n bits), is calculated the ratio of the masking
threshold to noise, being MNR(n) = M/N(n). Then, after bit
allocation for the frequency band with the smallest ratio of
masking threshold to noise MNR(n), that ratio of masking
threshold to noise MNR(n) is re-calculated, and bits are
allocated to the frequency band with the lowest ratio.

Note that the aural characteristics of persons with typical
aural characteristics are the model for the minimum limit of
audibility, masking threshold, etc. mentioned above.
Accordingly, there are cases where listeners will feel a sense
of incongruity due to differences in hearing or preference.

For example, in cases where the spectral composition of the
input digital data is comparatively flat, like white noise, bit
allocation will be made with the masking threshold at the
minimum limit of audibility, so most of the quantized bits will
be allocated to the mid- to low-range. Accordingly, depending
on the size of the spectral composition, quantized bits may not
be allocated to the ultra-low and ultra-high ranges, giving
some listeners a sense of incongruity.

Again, when the input digital data is a composite wave composed
of a signal with a narrow spectrum band (such as a sine wave
signal) and white noise, the frequency bands f1 which include
the sine wave signal will have more power, but as for frequency
bands f2 which are far from the frequency bands f1, the farther
from the frequency bands f1, the greater the drop in power.
Accordingly, there will be almost no masking from the sine wave
signal at a frequency band f2, and the influence of masking
from the power of the frequency band f2 itself is increased.
Because of this, there will be no great difference between the
ratio of signal to masking threshold (SMR: the ratio of a
frequency band's own power S to masking threshold M) at the
frequency bands f1 and the same ratio SMR at the frequency
bands f2.

In other words, if the power of a signal is S, and the power of
quantized noise is N(n) when each frequency band is quantized
into n bits, then, based on the relative relationship between
the two, the ratio of masking threshold to noise
MNR(n) = M/N(n) = (S/N(n))/(S/M) will be approximately the same
value at the frequency bands f1 and f2. Accordingly, since the
conventional adaptive bit allocation methods perform bit
allocation based only on the ratio of masking threshold to
noise MNR(n), their drawback is that approximately the same
number of bits are allocated to the frequency bands f1 and f2.

As a result, if there are many frequency bands f2 which are not
influenced by the masking from the sine wave signal, the number
of bits allocated to the frequency bands f1 which include the
sine wave signal becomes relatively smaller, the quantization
error of the sine wave signal becomes greater, and sound
quality deteriorates.

In regard to this point, the present Applicant has proposed, in
Japanese Unexamined Patent Publication No. 7-202823/1995, a
structure which automatically limits the number of bits which
may be allocated to frequency bands with low power S. However,
a drawback of this conventional art is that, since the maximum
number of bits which may be allocated to each frequency band is
determined on the basis of its power, when the power of white
noise is large, there are cases when no limitation on bit
allocation to that frequency band is made.

SUMMARY OF THE INVENTION 

One object of the present invention is to provide a method of
encoding digital data capable of attaining a sound quality
which accords with the listener's hearing.

Another object of the present invention is to provide 
a method of encoding digital data capable of preventing 
deterioration of sound quality even of signals with nar- 
row spectrum bands. 

In order to realize the first object mentioned above, the first
method of encoding digital data of the present invention
encodes digital data such as musical tones and sounds by
converting it into frequency domains, dividing the converted
spectra into a plurality of frequency bands, changing a minimum
limit of audibility characteristic so as to set a masking
threshold, and allocating quantized bits for each frequency
band in accordance with ratios of masking threshold to noise
which are found for each frequency band in accordance with
power or energy of each frequency band in consideration of
aural-psychological characteristics.

The above structure, by enabling change of the minimum limit of
audibility characteristic among aural-psychological
characteristics, frees aural-psychological characteristics from
definition by the characteristics of persons with typical
hearing, and makes possible selection of whether or not to
allocate bits to spectra with small inaudible domains, or
spectra with ultra-low or ultra-high domains. Accordingly, it
becomes possible to respond to persons with superior hearing or
to individual, subjective preference, and sound quality which
accords with listeners' hearing can be attained.

Next, in order to realize the first object mentioned above, the
second method of encoding digital data of the present invention
encodes digital data such as musical tones and sounds by
converting it into frequency domains, dividing the converted
spectra into a plurality of frequency bands, changing a masking
characteristic so as to set a masking threshold, and allocating
quantized bits for each frequency band in accordance with
ratios of the masking threshold to noise for each frequency
band which are found in accordance with power or energy of each
frequency band in consideration of aural-psychological
characteristics.

The above structure, by enabling change of the masking
characteristic among the aural-psychological characteristics,
frees aural-psychological characteristics from definition by
the characteristics of persons with typical hearing, and makes
possible selection of whether to allocate bits, for example, to
spectra which suffer masking in a critical band. Accordingly,
it becomes possible to respond to persons with superior hearing
or to individual, subjective preference, and sound quality
which accords with listeners' hearing can be attained.

Next, in order to realize the first object mentioned above, the
third method of encoding digital data of the present invention
encodes digital data such as musical tones and sounds by
converting it into frequency domains, dividing the converted
spectra into a plurality of frequency bands, and switching
among (i) bit allocation in accordance with ratios of masking
threshold to noise which are found for each frequency band in
accordance with power or energy of each frequency band in
consideration of aural-psychological characteristics, (ii) bit
allocation in accordance with a representative value of the
power or the energy of each frequency band, and (iii) bit
allocation giving weight to each of the foregoing bit
allocation methods.

With respect to data, such as white noise, having a spectral
composition which is comparatively flat, the above structure
makes possible bit allocation which is flat along the frequency
axis. Again, with respect to data, such as sine wave signals,
with narrow band width, the above structure makes possible bit
allocation which emphasizes the signal with narrow band width.
Accordingly, selection of a sound quality which is suited to
the source of the musical tone is made possible.

Finally, the fourth method of encoding digital data of the
present invention, in order to realize the second object
mentioned above, switches among bit allocation methods (i),
(ii), and (iii) described in the third method of encoding
digital data in accordance with a relationship between the
masking threshold and peaks and local peaks found based on
differences in power or energy between adjacent spectra within
each frequency band.

The above structure makes it possible to automatically allocate
bits according to the method most suited to the digital data,
whether it is white noise or other data with wide band width,
or sine wave signals or other data with narrow band width, thus
preventing deterioration of sound quality, even with musical
tones not suited to bit allocation using simultaneous masking
such as the masking threshold/noise ratio.

The other objects, features, and superior points of the present
invention will be made clear by the description below. Further,
the advantages of this invention will be evident from the
following explanation in reference to the Figures.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a frequency spectrum diagram for explaining the
method of encoding according to the first embodiment of the
present invention.

Figure 2 is a block diagram showing the electrical structure of
a mini-disc recording and reproduction device, which is one
example of application of the present invention.

Figure 3 is a flow-chart for explaining the bit-allocation
method according to the first embodiment of the present
invention.

Figure 4 is a flow-chart for explaining the bit-allocation
method according to the second embodiment of the present
invention.

Figure 5 is a flow-chart for explaining the bit-allocation
method according to the third embodiment of the present
invention.

Figure 6 is a frequency spectrum diagram for explaining
operations for detection of peaks and local peaks in the
bit-allocation method shown in Figure 5.

DESCRIPTION OF THE EMBODIMENTS 

The first embodiment of the present invention will be explained
below, in reference to Figures 1 through 3.

Figure 1 is a frequency spectrum diagram for explaining the
method of encoding digital data according to the first
embodiment of the present invention, and Figure 2 is a block
diagram showing the electrical structure of a mini-disc
recording and reproduction device 1, which is one example of
application of the present invention. First, in reference to
Figure 2, the mini-disc recording and reproduction device 1
will be explained. Digital data, for example in the form of
light signals, is serially inputted to an input terminal 2 from
a digital audio signal source (not shown) such as a compact
disc reproduction device or a satellite broadcast receiver.
After the light signals are converted into electric signals by
a photoelectric element 3, they are sent to a digital PLL
circuit 4. The digital PLL circuit 4 extracts the clock from
the digital data, and recreates multibit data corresponding to
the sampling frequency and the number of quantized bits. Next,
in a frequency conversion circuit 5, the multibit data
undergoes sampling rate conversion to the 44.1 kHz conforming
to the mini-disc standard from, for example, the 44.1 kHz
sampling frequency of compact discs, the 48 kHz sampling
frequency of digital audio tape recorders, or the 32 kHz
sampling frequency of satellite broadcasts (A mode), and is
then sent to an audio compression circuit 6.
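
As an illustration of the sampling rate conversion mentioned
above, the following minimal sketch computes the rational
conversion ratio from each source rate to 44.1 kHz. The function
name resample_ratio is hypothetical, and the text does not say
how the frequency conversion circuit 5 is actually implemented;
the ratio is only the arithmetic any such converter has to
realize.

```python
from fractions import Fraction

MD_RATE_HZ = 44_100  # mini-disc sampling frequency stated in the text


def resample_ratio(source_rate_hz: int, target_rate_hz: int = MD_RATE_HZ) -> Fraction:
    """Return target/source as a reduced fraction.

    A rational resampler would interpolate by the numerator and
    decimate by the denominator (illustrative assumption).
    """
    return Fraction(target_rate_hz, source_rate_hz)


for name, rate in [
    ("compact disc", 44_100),
    ("digital audio tape", 48_000),
    ("satellite broadcast (A mode)", 32_000),
]:
    ratio = resample_ratio(rate)
    print(f"{name}: up by {ratio.numerator}, down by {ratio.denominator}")
```

For the 48 kHz source the reduced ratio is 147/160, and for the
32 kHz source it is 441/320; the 44.1 kHz source needs no
conversion.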

The audio compression circuit 6 performs compressed encoding of
the input data according to the foregoing ATRAC method. The
encoded audio data is sent through a shock-proof memory
controller 7 to a signal processing circuit 8. A shock-proof
memory 9 is provided in association with the shock-proof memory
controller 7. In addition to absorbing the difference in
transfer rates between the audio data outputted from the audio
compression circuit 6 and the audio data inputted to the signal
processing circuit 8, the shock-proof memory 9 also serves to
protect the audio data by interpolation of any breaks which
occur in the playback signal due to disturbance such as
vibration during the playback operation, which will be
discussed below.

The signal processing circuit 8 functions as an encoder and
decoder, and encodes the audio data as magnetic modulation
signals before sending it to a head driving circuit 11. The
head driving circuit 11 moves a recording head 12 to the
desired recording location on a magneto-optical disc 13, and
causes the recording head 12 to emit a magnetic field
corresponding to the magnetic modulation signals. At this time,
laser light is projected from an optical pickup 21 onto the
desired recording location on the magneto-optical disc 13, and
a magnetized pattern corresponding to the magnetic field
emitted by the recording head 12 is formed on the
magneto-optical disc 13.

During the playback operation, on the other hand, serial
signals corresponding to the magnetized pattern on the
magneto-optical disc 13 are reproduced by the optical pickup
21, and after the serial signals thus reproduced are amplified
by a high-frequency (RF) amplifier 22, they are sent to the
signal processing circuit 8 and decoded into audio data. After
the shock-proof memory controller 7 and the shock-proof memory
9 have eliminated the influence of any disturbance on the
decoded audio data, they are sent to an audio expansion circuit
23. The audio expansion circuit 23 performs a conversion
process which is the reverse of compressed encoding according
to the ATRAC method, and demodulates the audio data into
full-bit digital audio signals. The demodulated digital audio
signals are converted into analog audio signals by a
digital/analog (D/A) conversion circuit 24, and are then
outputted from an output terminal 25.

The serial signals amplified by the high-frequency amplifier 22
are also sent to a servo circuit 31. In response to the serial
signals which have been reproduced, the servo circuit 31 exerts
feedback control on the revolution speed of a spin motor 33
through a driver circuit 32, thus enabling reproduction at the
desired linear velocity. The servo circuit 31 also exerts
feedback control on the revolution speed of a feed motor 34,
thus enabling control of the position of the optical pickup 21
in the radial direction of the magneto-optical disc 13, i.e.,
control of tracking. Finally, the servo circuit 31 also exerts
feedback control on the focusing of the optical pickup 21.

The servo circuit 31, the optical pickup 21, the high-frequency
amplifier 22, the signal processing circuit 8, and the driver
circuit 32 are energized by a power ON/OFF circuit 35. The
power ON/OFF operations of the power ON/OFF circuit 35 and the
signal processing operations of the signal processing circuit,
which will be discussed below, are centrally managed by a
system control microcomputer 36. In association with the system
control microcomputer 36 is provided an input operating means,
which enables sound-quality selection operations, which will be
discussed below, as well as song title input, song selection
operations, etc.

Next, the bit allocation method in the first embodiment of the
present invention, which is performed according to the ATRAC
method by the audio compression circuit 6 of the mini-disc
recording and reproduction device 1 structured as described
above, will be explained, referring to Figures 1 and 3.

In the ATRAC method, the audio data sampled at 44.1 kHz, as
mentioned above, is divided into certain frequency bands,
specifically a Low frequency band from 0 kHz to 5.5 kHz, a
Middle frequency band from 5.5 kHz to 11 kHz, and a High
frequency band from 11 kHz to 22 kHz, and the audio data
bridging certain time frames for each divided frequency band is
converted, by means of the MDCT processing, into an MDCT
coefficient which is the data of one frequency domain. The MDCT
coefficients converted in this manner are then converted into
spectrum powers Si for I number of frequency bands
(i = 1, 2, ..., I, with I equal to, for example, 25).
Processing like that shown in Figure 3 is then carried out to
allocate quantized bits in accordance with each spectrum power
Si thus obtained.

The audio compression circuit 6 includes a table ROM 6a, and in
the table ROM 6a are stored masking characteristics and/or
minimum limit of audibility characteristics according to the
ATRAC method. These minimum limit of audibility characteristics
appear as a curve shown by reference symbols α1, α2, α3, and α4
on Figure 1. The masking characteristics, calculated in
accordance with the spectrum powers Si, a critical band width
of each frequency band, etc., appear, for a power distribution
like that shown in Figure 1, for example, as a curve shown by
reference symbols α11, α12, and α13. The minimum limit of
audibility characteristics shown by the reference symbols α1
through α4 and masking characteristics shown by reference
symbols α11 through α13 are prepared in accordance with the
aural-psychological characteristics of persons with typical
hearing characteristics, and are fixed characteristics.

However, in the first embodiment of the present invention, the
minimum limit of audibility and/or the masking characteristics
can be changed. In concrete terms, for example in the case of
the masking characteristics, the greater the spectrum power and
the higher the frequency, the larger the range of masking of
other frequency bands. In the example in Figure 1, the maximum
limit Smax of the range influenced by spectrum power S5, which
is a peak power, is shown by α13 × (1 ± εk). Here, εk is a
coefficient for weighting. If a plurality of variables k are
stored in advance in the table ROM 6a, and the variables k are
switched by means of a register 36a in the system control
microcomputer 36, the masking characteristic curve α13 can be
changed within the range from α14 through α15. The variable k
can be set by the listener through the input operating means
37.

For example, by changing the masking characteristic curve from
α13 to α14, the band masked is widened, the level of masking is
increased, and the number of bits allocated to signals with low
power is decreased, or even eliminated. Accordingly, bit
allocation to signals of relatively greater power is increased,
and the dynamic range of the high-power signals is increased.
If, on the other hand, the masking characteristic curve is
changed from α13 to α15, bit allocation to low-power signals is
increased, and bit allocation to signals of relatively greater
power is decreased. Accordingly, the frequency range can be
enlarged. The same effect can also be obtained by giving the
masking characteristic curve α13 an offset instead of
weighting.

In the same way, with regard to the minimum limit of audibility
characteristics, the minimum limit of audibility characteristic
curve α1 through α4, which is based on the aural-psychological
characteristics of persons with typical hearing
characteristics, can be weighted or given an offset, thereby
changing the α4 portion of the curve, for example, as shown by
reference symbol α5. In this way, relatively more bits are
allocated to the high-frequency bands.
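
The weighting and offset adjustments described above can be
sketched as follows. This is only an illustration under stated
assumptions: the step size WEIGHT_STEP, the table K_TABLE, and
the function names are hypothetical stand-ins for the values
held in the table ROM 6a and selected through the register 36a,
none of which are given numerically in the text.

```python
WEIGHT_STEP = 0.1              # hypothetical weighting coefficient per unit of k
K_TABLE = (-2, -1, 0, 1, 2)    # hypothetical selectable variables k


def weighted_curve(curve, k, step=WEIGHT_STEP):
    """Scale every point of a characteristic curve by (1 + step * k).

    A positive k raises the curve and a negative k lowers it, which
    mirrors the (1 +/- weighting) form used in the text.
    """
    factor = 1.0 + step * k
    return [factor * point for point in curve]


def offset_curve(curve, offset):
    """Shift every point of a characteristic curve by a constant offset."""
    return [point + offset for point in curve]


reference = [10.0, 20.0, 30.0]                      # made-up curve samples
raised = weighted_curve(reference, K_TABLE[4])      # k = 2
lowered = weighted_curve(reference, K_TABLE[0])     # k = -2
shifted = offset_curve(reference, -5.0)
```

Either function can be applied to the masking characteristic
curve or to the minimum limit of audibility curve, since the
text treats both adjustments in the same way.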

Next, processing for allocation of quantized bits will be
explained, referring to Figure 3. First, in Step p1, the
spectrum power Si of each frequency band is calculated from the
sum of squares of the MDCT coefficients for that frequency band
(which are obtained by means of the MDCT processing). In Step
p2, the audio compression circuit 6 selects, through the
register 36a of the system control microcomputer 36, parameters
for change of masking characteristics, such as the variables k
which are stored in the table ROM 6a. In Step p3, in the same
way as in Step p2, parameters for change of the minimum limit
of audibility characteristics are selected.
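
Step p1 can be sketched as below. The function name band_powers
and the (start, stop) index layout of band_edges are assumptions
made for illustration; the text only states that each Si is the
sum of squares of the MDCT coefficients in band i.

```python
def band_powers(mdct_coeffs, band_edges):
    """Return the spectrum power of each band.

    mdct_coeffs: sequence of MDCT coefficients for one block.
    band_edges:  list of (start, stop) index pairs, stop exclusive
                 (hypothetical layout, not taken from the text).
    """
    return [
        sum(c * c for c in mdct_coeffs[start:stop])
        for start, stop in band_edges
    ]


coeffs = [1.0, 2.0, 3.0, 4.0]
powers = band_powers(coeffs, [(0, 2), (2, 4)])  # [1 + 4, 9 + 16]
```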

In Step p4, reference masking characteristics and minimum limit
of audibility characteristics previously calculated and stored
in the table ROM 6a are changed in accordance with the
parameters selected in Steps p2 and p3, and these two
characteristics are synthesized in order to determine a final
masking threshold. In other words, if the minimum limit of
audibility characteristic curve thus changed is as shown by the
reference symbols α1, α2, α3, α5, and the masking
characteristic curve thus changed is as shown by the reference
symbols α11, α12, α14, the curve of the final masking threshold
obtained by synthesis will be as shown by the reference symbols
α1, α12, α14, α3, α5.
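
The text does not define how the two curves are synthesized. One
plausible reading, consistent with a final curve that follows
whichever characteristic dominates in each region, is a
point-by-point maximum. The sketch below assumes exactly that;
the function name and sample values are hypothetical.

```python
def synthesize_threshold(min_audibility, masking):
    """Combine two curves sampled at the same frequency points.

    Assumption: the final masking threshold at each point is the
    larger of the two characteristics. The source does not state
    the synthesis rule explicitly.
    """
    if len(min_audibility) != len(masking):
        raise ValueError("curves must be sampled at the same points")
    return [max(a, m) for a, m in zip(min_audibility, masking)]


# Made-up samples: audibility dominates at the edges, masking in the middle.
final = synthesize_threshold([5, 1, 1, 4], [0, 6, 3, 0])
```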

In Step p5, if the index of each frequency band is i, the ratio
of the frequency band's spectrum power Si (calculated in Step
p1) to its masking threshold Mi (calculated in Step p4),
SMRi = Si/Mi, is calculated for all frequency bands. On a
logarithmic graph, the ratio SMRi for each frequency band will
correspond to that part of the length of the spectrum power Si
which exceeds the masking threshold Mi.

Next, in Step p6, the ratio of spectrum power Si to the power
of quantized noise Ni(n), when the spectrum power Si of each
frequency band is quantized into n bits, is calculated:
SNRi(n) = Si/Ni(n). Statistically, the ratio SNRi(n) is a
constant in accordance with the characteristics of the signal,
so it may be calculated in advance by statistical processing.
From the ratio of the ratio SNRi(n) to the ratio SMRi can be
calculated the ratio of masking threshold to the power of
quantized noise, being MNRi(n) = SNRi(n)/SMRi.
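
The relation between the three ratios can be checked
numerically. In the minimal sketch below the function names are
hypothetical and the input values are made up; the point is only
that SNR divided by SMR equals M/N, because the signal power S
cancels.

```python
import math


def to_db(ratio):
    """Convert a power ratio to decibels."""
    return 10.0 * math.log10(ratio)


def mnr(signal_power, mask_threshold, noise_power):
    """Mask-to-noise ratio computed as SNR / SMR.

    SMR = S / M and SNR = S / N, so SNR / SMR = M / N.
    """
    smr = signal_power / mask_threshold
    snr = signal_power / noise_power
    return snr / smr


value = mnr(signal_power=100.0, mask_threshold=10.0, noise_power=2.0)
# The same quantity computed directly from M and N:
direct = 10.0 / 2.0
```

In decibels the division becomes a subtraction, so the MNR in dB
is the SNR in dB minus the SMR in dB.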

In Step p7, the quantized bits are allocated to each frequency
band as follows. The number of bits n is increased from 0, and,
at each increase, the ratio of masking threshold to power of
quantized noise MNRi(n) is calculated for each frequency band,
and a bit is allocated to the frequency band where the ratio
MNRi(n) is the smallest. In this way, each time the number of
quantized bits n is increased, a bit is allocated to the
frequency band with the smallest ratio MNRi(n), and if this is
repeated until allocation of all available bits is completed,
the word length of each frequency band is determined, and this
is outputted. In other words, bits are allocated starting with
the frequency band in which the length of that part of the
spectrum power Si exceeding the threshold Mi is longest.
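
The loop of Step p7 can be sketched as a greedy allocation. This
is a simplified illustration, not the encoder's actual
procedure: it assumes that each extra bit improves SNR by about
6.02 dB, that one word-length increment costs exactly one bit
from the budget, and that word lengths are capped at a
hypothetical max_word_length. All powers and thresholds must be
positive.

```python
import math


def allocate_bits(powers, thresholds, total_bits, max_word_length=16):
    """Greedily give one bit at a time to the band with the lowest MNR."""
    word_lengths = [0] * len(powers)

    def mnr_db(i):
        smr_db = 10.0 * math.log10(powers[i] / thresholds[i])
        snr_db = 6.02 * word_lengths[i]  # assumed SNR gain per bit
        return snr_db - smr_db

    for _ in range(total_bits):
        candidates = [
            i for i in range(len(powers)) if word_lengths[i] < max_word_length
        ]
        if not candidates:
            break  # every band is already at the cap
        worst = min(candidates, key=mnr_db)
        word_lengths[worst] += 1
    return word_lengths


# Band 0 is 30 dB above its threshold, band 1 is 10 dB above,
# band 2 lies below its threshold and receives nothing.
lengths = allocate_bits([1000.0, 10.0, 1.0], [1.0, 1.0, 2.0], total_bits=6)
```

With these made-up inputs the band whose power exceeds its
threshold by the largest margin receives bits first, as the text
describes.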

Thus, bits are allocated in such a way that the masking
threshold, as shown in Figure 1, is changed to accord with the
listener's preference.

The foregoing has described the case of change of both the
masking characteristics and the minimum limit of audibility
characteristics, but the present invention is not limited to
such a case; either the masking characteristics or the
minimum-audibility characteristics may be changed alone.

In short, change of the minimum limit of audibility
characteristics alone, for example, makes it possible to select
whether or not to allocate bits to small spectra in the
inaudible range or spectra in the ultra-low or ultra-high
ranges. Again, change of the masking characteristics only,
since it entails change of masking characteristics which are
determined by the critical bands in accordance with the power
and the frequency of each frequency band, makes it possible to
select whether or not to allocate bits to spectra which are
masked by spectra with comparatively high power. In this way,
sound quality which accords with the hearing of each listener
can be obtained.

The second embodiment of the present invention will be
explained below, in reference to Figure 4.

Figure 4 is a flow chart for explaining the bit allocation
method in the second embodiment of the present invention. The
notable feature of this bit allocation method is that it is
possible to set a desired percentage x between (a) the bit
allocation according to the ratio of masking threshold to
quantized noise MNRi(n) and (b) the bit allocation according to
the power of quantized noise SNi(n), when the spectrum power
Si, which is a representative value of the power or energy of
each frequency band, is quantized into n bits. Several of the
percentages x of (a) to (b) are stored in advance in the table
ROM 6a of the audio compression circuit 6, and selection among
the different percentages x can be performed, through the
register 36a of the system control microcomputer 36, in
response to the operations from the input operating means 37.

In concrete terms, first, in Step p11, in the same way as in
Step p1 in the first embodiment, the spectrum power Si of each
frequency band is calculated from the sum of squares of the
respective MDCT coefficients. In Step p12, the value in the
register 36a of the system control microcomputer 36 is read,
and the corresponding percentage x% is selected from the table
ROM 6a.

If the percentage x determined in this way is 0, i.e., when the
number of bits B1 available for the first allocation is 0, then
the bit allocation according to the ratio of masking threshold
to quantized noise will not be performed, and the processing
proceeds directly to Step p18, which will be discussed below.
In contrast, if the number of bits B1 available for the first
allocation is not 0, then Step p13 is carried out.

In Step p13, given a total number of mini-disc audio spectrum
data bits B0 (1,144 to 1,464 bits), the number of bits B1
available for the first allocation in accordance with the ratio
MNRi(n) is calculated: B1 = B0 × (x/100). In Step p14, in
accordance with previously-calculated masking characteristics
and minimum limit of audibility characteristics corresponding
to the aural-psychological characteristics of persons with
typical hearing, a masking threshold, i.e., the curve α1, α12,
α13, α3, α4, is calculated. Then, in Steps p15 and p16, as in
Steps p5 and p6 above, the ratio of masking threshold to power
of quantized noise MNRi(n) for each frequency band is
calculated from the ratio SMRi of the frequency band's spectrum
power Si to its masking threshold Mi. In Step p17, bit
allocation is performed in the same way as in Step p7 above,
but the total number of bits allocated in Step p17 is the
number of bits available for the first allocation B1, as
calculated in Step p13 above.

In Step p18, the power of quantized noise SNi(n) is calculated,
and in Step p19, bits are allocated to the frequency band with
the highest power of quantized noise SNi(n). Thereafter, the
power of quantized noise SNi(n) is re-calculated, bits are
allocated to the band where this value is highest, and this is
repeated until all bits available for the second allocation,
B2 = B0 × (1 - (x/100)), have been allocated. Steps p18 and p19
are carried out when the number of bits available for the first
allocation is 0 and the processing has proceeded directly from
Step p12 to Step p18, or when x does not equal 100, but when x
does equal 100, i.e., when B2 = 0, the word length is outputted
directly after Step p17.
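
The split of the bit budget and the second allocation stage can
be sketched as follows. Several details are assumptions made for
illustration only: the integer rounding of B1, the model in
which each allocated bit divides the quantized-noise power by 4
(about 6 dB), and the function names. The text gives only
B1 = B0 × (x/100), B2 = B0 × (1 - (x/100)), and the rule of
repeatedly serving the band with the highest noise power.

```python
def split_bits(total_bits, x_percent):
    """Split B0 into B1 (MNR-based stage) and B2 (noise-power stage)."""
    if not 0 <= x_percent <= 100:
        raise ValueError("x must lie between 0 and 100")
    b1 = total_bits * x_percent // 100  # rounding down is an assumption
    b2 = total_bits - b1
    return b1, b2


def allocate_by_noise_power(powers, word_lengths, bits):
    """Give each bit to the band whose quantized-noise power is highest.

    Assumed noise model: power / 4 ** word_length.
    """
    lengths = list(word_lengths)
    for _ in range(bits):
        noisiest = max(
            range(len(powers)), key=lambda i: powers[i] / 4 ** lengths[i]
        )
        lengths[noisiest] += 1
    return lengths


b1, b2 = split_bits(1144, 25)
second_stage = allocate_by_noise_power([64.0, 8.0, 1.0], [0, 0, 0], 4)
```

With x = 0 the whole budget goes to the second stage, and with
x = 100 the second stage receives no bits, matching the two
special cases described above.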

In cases when the input signal is a composite wave of a sine
wave signal and white noise, and in other cases where it
resembles a single sine wave, for example with a solo piano
piece, if the bit allocation is performed only according to the
ratio of masking threshold to quantized noise MNRi(n), many
bits will be allocated to noise elements with low power, and
the error in quantizing of the piano becomes relatively great.
However, if the bit allocation percentage x can be changed as
outlined above, the bit allocation according to the power of
quantized noise SNi(n) is carried out in addition to that
according to the ratio of masking threshold to quantized noise
MNRi(n), thereby ensuring that the number of bits allocated to
the piano can be increased and the error in quantizing of the
piano is reduced.

Again, if the input signal is composed of sound with many local
peaks and noise, for example an orchestra piece, the bit allocation
can be performed in accordance with the ratio of masking threshold
to quantized noise MNRi(n), in which the noise and the musical tones
composing small local peaks in bands close to large signals can be
masked, thus allocating no bits to them, and more bits can be
allocated to large signals which are not masked. This enables high
fidelity recording.

Further, with input signals lying between the foregoing two
examples, which are composed of a musical tone with three or four
local peaks and noise, for example a solo clarinet piece, by giving
weight both to the bit allocation according to the ratio of masking
threshold to quantized noise MNRi(n) and to the bit allocation
according to the power of quantized noise SNi(n), fidelity of the
clarinet can be improved.

In this way, the bit allocation method most suited to any musical
tone source can be selected.

The third embodiment of the present invention will 
next be explained, in reference to Figures 5 and 6. 

Figure 5 is a flow chart for explaining the bit allocation method in
the third embodiment of the present invention. The notable feature
of this bit allocation method is that the percentage x of (a) the bit
allocation according to the ratio of masking threshold to quantized
noise MNRi(n) to (b) the bit allocation according to the power of
quantized noise SNi(n) is automatically determined on the basis of
the relationship between (1) peaks and local peaks in spectrum
powers Si and (2) masking thresholds.

First, the peak value among the spectrum powers of all frequency
bands from S1 to SI, such as that shown by reference symbol S5 on
Figure 6, is found. Then a masking threshold, such as that shown on
Figure 6, which includes masking characteristics due to that peak
level, is found. Next, local peaks such as that shown by reference
symbol S8 on Figure 6 are found for each frequency band. The number
of such local peaks masked by the masking threshold, and the number
of such local peaks not so masked, are respectively found, and the
ratio between masked local peaks and unmasked local peaks determines
the percentage x.

In other words, if the total number of local peaks is NM, and the
number of masked local peaks is M, then for the case

    M/(NM+1) = 0    (1)

that is, if there are no masked local peaks, the percentage x will
be 0%, and the number of bits available for the first allocation B1
will be set at 0. If, on the other hand,

    0 < M/(NM+1) ≤ 0.5    (2)

then x is from 50% to 90%, and if

    0.5 < M/(NM+1)    (3)

then x is 100%, and the number of bits available for the first
allocation B1 will be the entirety of total available bits B0.
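
The selection of x from M and NM can be sketched as a small function. This is a sketch under assumptions: the text only gives a range of 50% to 90% for the middle case, so the value used there is left as a caller-supplied parameter (`mid_x`, a hypothetical name with an arbitrary default), and the boundary value 0.5 is treated as belonging to the middle case, as in the Figure 6 example.

```python
def select_percentage(m, nm, mid_x=70.0):
    """Return the percentage x of bits given to the first (MNR-based) allocation.

    m     -- number of local peaks masked by the masking threshold (M)
    nm    -- total number of local peaks (NM)
    mid_x -- assumed value inside the 50%..90% range for the middle case
    """
    if not 0 <= m <= nm:
        raise ValueError("expected 0 <= M <= NM")
    if not 50.0 <= mid_x <= 90.0:
        raise ValueError("mid_x must lie between 50 and 90")

    ratio = m / (nm + 1)
    if ratio == 0:        # case (1): no masked local peaks
        return 0.0
    if ratio <= 0.5:      # case (2): some local peaks are masked
        return mid_x
    return 100.0          # case (3): most local peaks are masked


print(select_percentage(0, 3))  # no masked peaks
print(select_percentage(1, 1))  # one local peak, masked: ratio is 0.5
print(select_percentage(2, 2))  # ratio is 2/3
```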

Here the detection of local peaks and the selection of the
percentage x will be discussed. The local peaks are found for all
the frequency bands after the peak spectrum power (S5 in Figure 6)
is found. In the example in Figure 6, the differences D34, D45, ...,
D89 and their polarity between each spectrum power S3 to S9 within a
certain number of frequency bands from peak value S5 (in the example
in Figure 6, two frequency bands on the low-frequency side and four
on the high-frequency side) are found, and the local peaks are
detected on the basis of change in polarity and the absolute value
of those differences. In this way, the local peaks are found across
all frequency bands. In concrete terms, in the case of Figure 6,
there is only one local peak (S8), and that local peak is masked by
the masking threshold; thus M/(NM+1) = 1/(1+1) = 0.5. Accordingly,
equation (2) above will be applied, and a percentage x = 50% to 90%
will be selected.
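
The peak detection just described can be illustrated with a short sketch. The numeric values below are invented for illustration and are not read from Figure 6; the minimum step size `min_step` and the function names are likewise assumptions, since the text does not state how the absolute value of the differences is used.

```python
def find_local_peaks(power, main_peak, min_step=0.0):
    """Return indices of local peaks, excluding the main peak.

    A band counts as a local peak when the difference from its lower
    neighbour is positive and the difference to its upper neighbour is
    negative (a change in polarity), both larger in magnitude than min_step.
    """
    peaks = []
    for i in range(1, len(power) - 1):
        if i == main_peak:
            continue
        rise = power[i] - power[i - 1]
        fall = power[i + 1] - power[i]
        if rise > min_step and fall < -min_step:
            peaks.append(i)
    return peaks


def count_masked(power, mask, peaks):
    """Count the local peaks lying at or below the masking threshold."""
    return sum(1 for i in peaks if power[i] <= mask[i])


# Hypothetical powers for bands S3..S9 and a hypothetical masking threshold.
power = [20, 35, 60, 30, 18, 25, 12]
mask = [10, 30, 55, 45, 35, 28, 20]

main_peak = max(range(len(power)), key=power.__getitem__)  # index 2, i.e. S5
local_peaks = find_local_peaks(power, main_peak)           # [5], i.e. S8
m = count_masked(power, mask, local_peaks)
nm = len(local_peaks)
print(local_peaks, m / (nm + 1))
```

With these invented values there is one local peak and it is masked, so the ratio is 0.5, matching the situation described for Figure 6.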

Next, the bit allocation method of the third embodiment will be
explained in reference to Figure 5.

In this bit allocation method, after the spectrum power Si of each
frequency band has been calculated in Step p21 as in Step p11 and
Step p1 above, the peak value is found in Step p22, and the masking
threshold including the masking characteristics of that peak value
is found in Step p23. In Step p24, the percentage x is calculated by
means of equations (1) through (3) above, and the number of bits
available for the first allocation B1 is calculated. Then, in Steps
p25 through p27, as in Steps p15 through p17 above, the first bit
allocation, according to the ratio of masking threshold to quantized
noise MNRi(n), is performed, and then in Steps p28 and p29, as in
Steps p18 and p19 above, the second bit allocation, according to the
power of quantized noise SNi(n), is performed.

In this way, a bit allocation with high sound quality appropriate to
the musical tones like that shown in Figure 4 can be performed
automatically, and deterioration of sound quality, even with respect
to musical tones not suited to the bit allocation according to the
ratio of masking threshold to quantized noise MNRi(n), can be
prevented.




The foregoing explains the case in which the percentage x is
calculated by means of equations (1) through (3) above (the
preferred case), but the present invention need not be limited to
this case. A similar effect may be obtained by calculation of the
percentage x by means of equations (1) and (3).

The change of the minimum limit of audibility and/or 
masking characteristics shown in Figures 1 and 3 above 
may also be applied to the bit allocations shown in Fig- 
ures 4 and 5. 

The concrete embodiments and examples of implementation discussed in
the foregoing detailed explanations of the present invention serve
solely to illustrate the technical details of this invention, and
the present invention should not be narrowly interpreted within the
limits of such concrete examples, but rather may be applied in many
variations without departing from the spirit of this invention and
the scope of the patent claims set forth below.

Claims 

1. A method of encoding digital data comprising the steps of:

(a) converting the digital data into a frequency spectrum, and
dividing the frequency spectrum into a plurality of frequency bands;

(b) changing a minimum limit of audibility characteristic among
aural-psychological characteristics so as to set a masking
threshold;

(c) finding a ratio of masking threshold to noise for each frequency
band, based on power or energy of each frequency band; and

(d) allocating a number of quantized bits to each frequency band in
accordance with the ratio of masking threshold to noise.

2. The method of encoding digital data according to Claim 1,
wherein:

in said steps (c) and (d), the number of the quantized bits is
increased incrementally from 0, and, at each increase, the ratio of
masking threshold to noise is found for each frequency band, and
bits are allocated to a frequency band having the smallest ratio of
masking threshold to noise.

3. A method of encoding digital data comprising the steps of:

(a) converting the digital data into a frequency spectrum, and
dividing the frequency spectrum into a plurality of frequency bands;

(b) changing a masking characteristic among aural-psychological
characteristics so as to set a masking threshold;

(c) finding the ratio of masking threshold to noise for each
frequency band, based on power or energy of each frequency band; and

(d) allocating a number of quantized bits to each frequency band in
accordance with the ratio of masking threshold to noise.

4. The method of encoding digital data according to Claim 3,
wherein:

in said step (b), after change of the masking characteristic, the
minimum limit of audibility characteristic among aural-psychological
characteristics is changed so as to set the masking threshold.

5. The method of encoding digital data according to Claim 3,
wherein:

in said steps (c) and (d), the number of the quantized bits is
increased incrementally from 0, and, at each increase, the ratio of
masking threshold to noise is found for each frequency band, and
bits are allocated to a frequency band having the smallest ratio of
masking threshold to noise.

6. The method of encoding digital data according to Claim 4,
wherein:

in said steps (c) and (d), the number of the quantized bits is
increased incrementally from 0, and, at each increase, the ratio of
masking threshold to noise is found for each frequency band, and
bits are allocated to a frequency band having the smallest ratio of
masking threshold to noise.

7. A method of encoding digital data in which the digital data is
converted into a frequency spectrum, the frequency spectrum is
divided into frequency bands, and quantized bits are allocated to
each frequency band, said method comprising the steps of:

setting a percentage for allocation of the quantized bits, the
percentage being changeable; and

(i) performing a first allocation of the quantized bits in
accordance with ratios of masking threshold to noise which are found
for each frequency band in accordance with power or energy of each
frequency band in consideration of aural-psychological
characteristics; and

(ii) performing a second allocation of the quantized bits in
accordance with a representative value of the power or energy of
each frequency band;

wherein said steps (i) and (ii) are performed in accordance with the
percentage so as to allocate the total number of quantized bits,
thereby allocating the number of the quantized bits of each
frequency band.

8. The method of encoding digital data according to Claim 7,
wherein:

the percentage is determined in accordance with a relationship
between the masking threshold and peaks and local peaks found based
on differences in power or energy between adjacent spectra within
each frequency band.

9. The method of encoding digital data according to Claim 7,
wherein:

the percentage corresponds, if NM is the total number of the local
peaks, with a ratio of the number M of the local peaks which are
masked by the masking threshold to the number N of the local peaks
which are not masked by the masking threshold.

10. The method of encoding digital data according to Claim 9,
wherein:

the percentage is set to 0% when M/(NM+1) = 0 is satisfied; and
the percentage is set to 100% when 0.5 < M/(NM+1) is satisfied.

11. A method of encoding digital data in which, in converting
digital data such as musical tones and sounds into frequency
domains, dividing the converted spectra into a plurality of
frequency bands, and allocating bits for each frequency band so as
to encode the digital data, the allocating of bits is performed in
accordance with a ratio of masking threshold to noise of each
frequency band in consideration of aural-psychological
characteristics, the ratio being calculated for each frequency band
in accordance with power or energy of each frequency band, wherein:

minimum limit of audibility characteristics that have been
previously found are changeable.

12. A method of encoding digital data in which, in converting
digital data such as musical tones and sounds into frequency
domains, dividing the converted spectra into a plurality of
frequency bands, and allocating bits for each frequency band so as
to encode the digital data, the allocating of bits is performed in
accordance with a ratio of masking threshold to noise of each
frequency band in consideration of aural-psychological
characteristics, the ratio being calculated for each frequency band
in accordance with power or energy of each frequency band, wherein:

masking characteristics that have been previously found in
accordance with the power or the energy are changeable.

13. A method of encoding digital data in which digital data such as
musical tones and sounds is converted into frequency domains, the
converted spectra are divided into a plurality of frequency bands,
and bits are allocated for each frequency band so as to perform the
encoding, said method comprising the steps of:

(i) performing a first allocation of the quantized bits in
accordance with ratios of masking threshold to noise which are found
for each frequency band in accordance with power or energy of each
frequency band;

(ii) performing a second allocation of the quantized bits in
accordance with a representative value of the power or the energy of
each frequency band; and

(iii) performing a third allocation of the quantized bits giving
weight to the bit allocation methods of each of said steps (i) and
(ii);

wherein said steps (i), (ii), and (iii) are switchable.

14. The method of encoding digital data according to Claim 13,
wherein:

said steps (i), (ii), and (iii) are switched in accordance with a
relationship between the masking threshold and peaks and local peaks
found based on differences in power or energy between adjacent
spectra within each frequency band.